

Search for: All records

Creators/Authors contains: "Liehr, Maximilian"


  1. This work presents the first resistive random access memory (RRAM)-based compute-in-memory (CIM) macro design tailored for genome processing. We analyze and demonstrate two key types of genome processing applications using our developed CIM chip prototype: the state-of-the-art (SOTA) Burrows–Wheeler transform (BWT)-based DNA short-read alignment and alignment-free mRNA quantification. Our CIM macro is designed and optimized to support the major functions essential to these algorithms, e.g., parallel XNOR operations, count, addition, and parallel bit-wise AND operations. The proposed CIM macro prototype is fabricated with monolithic integration of HfO₂ RRAM and 65-nm CMOS, achieving 2.07 TOPS/W (tera-operations per second per watt) and 2.12 G suffixes/J (suffixes per joule) at 1.0 V, which is the most energy-efficient solution to date for genome processing.
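As a rough software illustration of the BWT-based matching that the first abstract refers to, the sketch below counts exact occurrences of a short read with a textbook FM-index backward search. It is not the macro's implementation: the function names `bwt` and `backward_search` and the example sequence are assumptions made for this sketch, and the character comparison plus counting inside `occ` only loosely mirrors the XNOR, count, and addition primitives the abstract lists.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations.

    `text` must end with a unique sentinel '$' that sorts before all
    other characters.
    """
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def backward_search(bwt_str, pattern):
    """Count exact occurrences of `pattern` in the original text."""
    # C[c] = number of characters in the text strictly smaller than c.
    counts = {}
    for ch in bwt_str:
        counts[ch] = counts.get(ch, 0) + 1
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]

    def occ(c, i):
        # Occurrences of c in bwt_str[:i]: compare each symbol, then count.
        return sum(1 for x in bwt_str[:i] if x == c)

    lo, hi = 0, len(bwt_str)  # half-open range of matching suffixes
    for c in reversed(pattern):
        if c not in C:
            return 0
        lo = C[c] + occ(c, lo)
        hi = C[c] + occ(c, hi)
        if lo >= hi:
            return 0
    return hi - lo


reference = "ACGTACGA$"  # toy reference sequence, chosen for illustration
index = bwt(reference)
print(backward_search(index, "ACG"))
```

Each step of the search needs only a symbol comparison across the BWT string, a count of matches, and one addition, which is why these operations are natural candidates for in-memory execution.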
  2. In this work, hafnium zirconium oxide (HZO)-based 100 × 100 nm² ferroelectric tunnel junction (FTJ) devices were implemented on a 300 mm wafer platform, using a baseline 65 nm CMOS process technology. FTJs consisting of TiN/HZO/TiN were integrated between the metal 1 (M1) and via 1 (V1) layers. Cross-sectional transmission electron microscopy and energy dispersive x-ray spectroscopy analysis confirmed the targeted thickness and composition of the FTJ film stack, while grazing incidence, in-plane x-ray diffraction analysis demonstrated the presence of the orthorhombic phase Pca2₁ responsible for the ferroelectric polarization observed in HZO films. Current measurement, as a function of voltage for both up- and down-polarization states, yielded a tunneling electroresistance (TER) ratio of 2.28. The device TER ratio and endurance behavior were further optimized by inserting a thin Al₂O₃ tunnel barrier layer between the bottom electrode (TiN) and the ferroelectric switching layer (HZO). This layer tunes the band offset between HZO and TiN, facilitating on-state tunneling conduction and creating an additional barrier in the off-state current conduction path. Investigation of the current transport mechanism showed that the current in these FTJ devices is dominated by direct tunneling at low electric field (E < 0.4 MV/cm) and by Fowler–Nordheim (F–N) tunneling at high electric field (E > 0.4 MV/cm). The modified FTJ device stack (TiN/Al₂O₃/HZO/TiN) demonstrated an enhanced TER ratio of ∼5 (2.2× improvement) and endurance up to 10⁶ switching cycles. Write voltage and pulse width dependent trade-off characteristics between TER ratio and maximum endurance cycles (Nc) were established that enabled an optimal balance of FTJ switching metrics. The FTJ memory cells also showed multi-level-cell characteristics, i.e., 2 bits/cell storage capability. Based on full 300 mm wafer statistics, a switching yield of >80% was achieved for fabricated FTJ devices, demonstrating the robustness of the fabrication and programming approach used for FTJ performance optimization. The realization of CMOS-compatible nanoscale FTJ devices on a 300 mm wafer platform demonstrates the promising potential of high-volume, large-scale industrial implementation of FTJ devices for various nonvolatile memory applications.
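The figures quoted in the second abstract can be cross-checked with a few lines of arithmetic. The helper names below are hypothetical, and treating the 0.4 MV/cm value as a sharp boundary is a simplification made for this sketch; the abstract only states which mechanism dominates on each side of it.

```python
def improvement_factor(new_ratio, old_ratio):
    """How many times larger the new TER ratio is than the old one."""
    return new_ratio / old_ratio


def levels_per_cell(bits):
    """Number of distinguishable states needed to store `bits` bits."""
    return 2 ** bits


def dominant_mechanism(e_field_mv_per_cm, boundary=0.4):
    """Dominant conduction mechanism reported in the abstract.

    The hard cut at `boundary` is an assumption of this sketch.
    """
    if e_field_mv_per_cm < boundary:
        return "direct tunneling"
    return "Fowler-Nordheim tunneling"


# TER ratio of about 5 versus the baseline 2.28 reported for TiN/HZO/TiN.
print(round(improvement_factor(5.0, 2.28), 1))
# 2 bits/cell implies this many resistance levels.
print(levels_per_cell(2))
print(dominant_mechanism(0.2), "/", dominant_mechanism(1.0))
```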
  3. RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs) and other machine learning algorithms. However, in the presence of RRAM device variations and lower precision, the mapping of DNNs to RRAM-based IMC suffers from severe accuracy loss. In this work, we propose a novel hybrid IMC architecture that integrates an RRAM-based IMC macro with a digital SRAM macro using a programmable shifter to compensate for the RRAM variations and recover the accuracy. The digital SRAM macro consists of a small SRAM memory array and an array of multiply-and-accumulate (MAC) units. The non-ideal output from the RRAM macro, due to device and circuit non-idealities, is compensated by adding the precise output from the SRAM macro. In addition, the programmable shifter allows for different scales of compensation by shifting the SRAM macro output relative to the RRAM macro output. On the algorithm side, we develop a framework for training DNNs to support the hybrid IMC architecture through ensemble learning. The proposed framework performs quantization (weights and activations), pruning, and RRAM IMC-aware training, and employs ensemble learning through different compensation scales by utilizing the programmable shifter. Finally, we design a silicon prototype of the proposed hybrid IMC architecture in the 65 nm SUNY process to demonstrate its efficacy. Experimental evaluation of the hybrid IMC architecture shows that the SRAM compensation allows for a realistic IMC architecture with multi-level RRAM cells (MLC) even though they suffer from high variations. The hybrid IMC architecture achieves up to 21.9%, 12.65%, and 6.52% improvement in post-mapping accuracy over state-of-the-art techniques, at minimal overhead, for ResNet-20 on CIFAR-10, VGG-16 on CIFAR-10, and ResNet-18 on ImageNet, respectively.
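The compensation idea in the third abstract, adding a shifted SRAM MAC result to the non-ideal RRAM MAC result, can be sketched in a few lines. Everything below is illustrative: the function names, the integer weight values, and the way the SRAM correction weights are chosen (here simply the known error) are assumptions of this sketch. The abstract states that the actual compensation is learned through training with ensemble learning.

```python
def mac(weights, activations):
    """Multiply-and-accumulate over one row of weights."""
    return sum(w * a for w, a in zip(weights, activations))


def hybrid_output(rram_weights, sram_weights, activations, shift):
    """RRAM MAC result plus the SRAM MAC result scaled by 2**shift.

    A negative `shift` scales the SRAM output down (arithmetic right shift).
    """
    rram_out = mac(rram_weights, activations)
    sram_out = mac(sram_weights, activations)
    if shift >= 0:
        scaled = sram_out << shift
    else:
        scaled = sram_out >> -shift
    return rram_out + scaled


# Hypothetical values: ideal weights, and what the RRAM cells effectively
# store after device variation.
ideal_weights = [3, -2, 5]
rram_weights = [4, -2, 3]
activations = [1, 2, 3]

# For illustration only, the SRAM holds the exact per-weight error.
sram_weights = [i - r for i, r in zip(ideal_weights, rram_weights)]

print(mac(ideal_weights, activations))                            # ideal result
print(mac(rram_weights, activations))                             # RRAM alone
print(hybrid_output(rram_weights, sram_weights, activations, 0))  # compensated
```

Changing `shift` changes the scale at which the small SRAM array corrects the RRAM output, which is the role the abstract assigns to the programmable shifter.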